Overview

Dataset statistics

Number of variables: 22
Number of observations: 1852394
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 310.9 MiB
Average record size in memory: 176.0 B
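
The two memory figures can be cross-checked against each other. The sketch below assumes that MiB is the binary unit (1,048,576 bytes) and that each variable is stored as one fixed-width 8-byte value per row, which would explain the 14.1 MiB "Memory size" reported for every variable; the constant and function names are illustrative only.

```python
# Figures copied from the Dataset statistics in this report.
TOTAL_MIB = 310.9
N_ROWS = 1_852_394
BYTES_PER_MIB = 1024 ** 2  # assumption: MiB is the binary unit


def average_record_size(total_mib: float, n_rows: int) -> float:
    """Average number of bytes per row implied by a total size in MiB."""
    return total_mib * BYTES_PER_MIB / n_rows


def column_size_mib(n_rows: int, bytes_per_value: int = 8) -> float:
    """Size in MiB of one column, assuming a fixed number of bytes per value."""
    return n_rows * bytes_per_value / BYTES_PER_MIB


avg_bytes = average_record_size(TOTAL_MIB, N_ROWS)
col_mib = column_size_mib(N_ROWS)
print(f"{avg_bytes:.1f} B per record, {col_mib:.1f} MiB per column")
```

Under these assumptions, 22 variables at 8 bytes each give 176 bytes per record, which matches the reported average.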

Variable types

DateTime: 2
Numeric: 9
Text: 8
Categorical: 3

Alerts

lat is highly overall correlated with merch_lat (High correlation)
long is highly overall correlated with merch_long and 1 other field (High correlation)
merch_lat is highly overall correlated with lat (High correlation)
merch_long is highly overall correlated with long and 1 other field (High correlation)
zip is highly overall correlated with long and 1 other field (High correlation)
is_fraud is highly imbalanced (95.3%) (Imbalance)
amt is highly skewed (γ1 = 40.81280918) (Skewed)
trans_num has unique values (Unique)
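
The 95.3% in the Imbalance alert is a score rather than the share of the majority class. One common definition, assumed here to be the one behind this alert, is one minus the normalized Shannon entropy of the class counts. The sketch below implements that definition with the standard library; the function name is illustrative and the counts in the usage line are invented, not taken from this report.

```python
import math
from typing import Sequence


def imbalance_score(counts: Sequence[int]) -> float:
    """Return 1 - H / H_max, where H is the Shannon entropy of the class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    n_classes = len(counts)
    total = sum(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    # Empty classes contribute nothing to the entropy.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Hypothetical counts (not from this report): 995 negatives, 5 positives.
print(round(imbalance_score([995, 5]), 3))
```

With this definition, a binary variable needs a very small minority class to score above 95%, which is why the alert and the raw class shares should be read together.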

Reproduction

Analysis started: 2025-04-30 01:51:43.014341
Analysis finished: 2025-04-30 01:54:45.445159
Duration: 3 minutes and 2.43 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json
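
The reported duration follows directly from the two timestamps. A minimal check with the standard library, using the values exactly as printed in this section:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2025-04-30 01:51:43.014341", FMT)
finished = datetime.strptime("2025-04-30 01:54:45.445159", FMT)

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```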

Variables

Distinct: 1819551
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB
Minimum: 2019-01-01 00:00:18
Maximum: 2020-12-31 23:59:34
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)
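
A histogram with fixed size bins splits the range from the minimum to the maximum into equal-width intervals and counts the values in each. The sketch below shows the idea with the standard library; the function name is hypothetical, and placing the maximum in the last bin is an assumption about the convention rather than something this report states.

```python
from typing import List, Sequence


def fixed_width_histogram(values: Sequence[float], bins: int = 50) -> List[int]:
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        counts[0] = len(values)  # degenerate case: all values are identical
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum goes into the last bin instead of opening a new one.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


counts = fixed_width_histogram([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], bins=5)
print(counts)
```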

cc_num
Real number (ℝ)

Distinct: 999
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.1738604 × 10^17
Minimum: 6.0416207 × 10^10
Maximum: 4.9923464 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: 6.0416207 × 10^10
5-th percentile: 6.3048488 × 10^11
Q1: 1.8004295 × 10^14
median: 3.5214173 × 10^15
Q3: 4.6422555 × 10^15
95-th percentile: 4.497914 × 10^18
Maximum: 4.9923464 × 10^18
Range: 4.9923463 × 10^18
Interquartile range (IQR): 4.4622125 × 10^15

Descriptive statistics

Standard deviation: 1.3091153 × 10^18
Coefficient of variation (CV): 3.1364616
Kurtosis: 6.1753558
Mean: 4.1738604 × 10^17
Median Absolute Deviation (MAD): 3.0764709 × 10^15
Skewness: 2.8510736
Sum: 5.0088429 × 10^18
Variance: 1.7137828 × 10^36
Monotonicity: Not monotonic
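
Skewness measures how asymmetric a distribution is around its mean: positive values indicate a longer right tail. As a rough sketch, the function below computes the bias-corrected (adjusted Fisher-Pearson) sample skewness with the standard library. That this is the exact estimator used for the figures in this report is an assumption, and the function name is illustrative.

```python
import math
from typing import Sequence


def sample_skewness(values: Sequence[float]) -> float:
    """Adjusted Fisher-Pearson skewness G1 (bias-corrected sample skewness)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no asymmetry
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)


print(sample_skewness([1, 2, 3, 4, 5]))   # symmetric data
print(sample_skewness([1, 1, 1, 10]))     # long right tail
```

For a column with about 1.85 million rows the bias correction is negligible, so the adjusted and unadjusted coefficients agree to many decimal places.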
Histogram with fixed size bins (bins=50)

Common Values

Value                  Count     Frequency (%)
6.538441737 × 10^15    4392      0.2%
3.02704321 × 10^13     4392      0.2%
4.642255475 × 10^15    4386      0.2%
6.538891243 × 10^15    4386      0.2%
4.364010865 × 10^15    4386      0.2%
3.447098678 × 10^14    4385      0.2%
6.011438889 × 10^15    4385      0.2%
4.586810169 × 10^15    4384      0.2%
4.512828415 × 10^18    4384      0.2%
4.904681492 × 10^15    4384      0.2%
Other values (989)     1808530   97.6%

Minimum 10 values

Value                  Count     Frequency (%)
6.041620718 × 10^10    2196      0.1%
6.042292873 × 10^10    2200      0.1%
6.042309813 × 10^10    738       < 0.1%
6.042785159 × 10^10    743       < 0.1%
6.048700208 × 10^10    735       < 0.1%
6.04905963 × 10^10     1465      0.1%
6.049559311 × 10^10    742       < 0.1%
5.018029536 × 10^11    2194      0.1%
5.018181333 × 10^11    8         < 0.1%
5.018282048 × 10^11    733       < 0.1%

Maximum 10 values

Value                  Count     Frequency (%)
4.992346398 × 10^18    2922      0.2%
4.989847571 × 10^18    1471      0.1%
4.980323468 × 10^18    736       < 0.1%
4.973530368 × 10^18    1467      0.1%
4.958589672 × 10^18    2191      0.1%
4.95682899 × 10^18     3657      0.2%
4.911818931 × 10^18    9         < 0.1%
4.906628656 × 10^18    3655      0.2%
4.897067971 × 10^18    1471      0.1%
4.890424427 × 10^18    2189      0.1%

Distinct: 693
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 43
Median length: 36
Mean length: 23.130553
Min length: 13

Characters and Unicode

Total characters: 42846898
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: fraud_Rippin, Kub and Mann
2nd row: fraud_Heller, Gutmann and Zieme
3rd row: fraud_Lind-Buckridge
4th row: fraud_Kutch, Hermiston and Farrell
5th row: fraud_Keeling-Crist

Words

Value                Count     Frequency (%)
and                  677362    15.7%
llc                  139662    3.2%
inc                  131148    3.0%
sons                 104651    2.4%
ltd                  100896    2.3%
plc                  94799     2.2%
group                72089     1.7%
fraud_kutch          15028     0.3%
fraud_schaefer       13367     0.3%
fraud_streich        13235     0.3%
Other values (804)   2956186   68.5%

Most occurring characters

Value                Count      Frequency (%)
a                    4158232    9.7%
r                    3851348    9.0%
d                    3055994    7.1%
e                    2665745    6.2%
u                    2654462    6.2%
n                    2526397    5.9%
(space)              2466029    5.8%
f                    1996096    4.7%
_                    1852394    4.3%
o                    1614017    3.8%
Other values (45)    16006184   37.4%

Most occurring categories, scripts and blocks

Value       Count      Frequency (%)
(unknown)   42846898   100.0%

Most frequent character per category, script and block

Every character is assigned to (unknown), so these three tables are identical to the "Most occurring characters" table.

category
Categorical

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

gas_transport: 188029
grocery_pos: 176191
home: 175460
shopping_pos: 166463
kids_pets: 161727
Other values (9): 984524

Length

Max length: 14
Median length: 12
Mean length: 10.525913
Min length: 4

Characters and Unicode

Total characters: 19498139
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: misc_net
2nd row: grocery_pos
3rd row: entertainment
4th row: gas_transport
5th row: misc_pos

Common Values

Value              Count    Frequency (%)
gas_transport      188029   10.2%
grocery_pos        176191   9.5%
home               175460   9.5%
shopping_pos       166463   9.0%
kids_pets          161727   8.7%
shopping_net       139322   7.5%
entertainment      134118   7.2%
food_dining        130729   7.1%
personal_care      130085   7.0%
health_fitness     122553   6.6%
Other values (4)   327717   17.7%

Length

Histogram of lengths of the category

Words

Value              Count    Frequency (%)
gas_transport      188029   10.2%
grocery_pos        176191   9.5%
home               175460   9.5%
shopping_pos       166463   9.0%
kids_pets          161727   8.7%
shopping_net       139322   7.5%
entertainment      134118   7.2%
food_dining        130729   7.1%
personal_care      130085   7.0%
health_fitness     122553   6.6%
Other values (4)   327717   17.7%

Most occurring characters

Value               Count     Frequency (%)
s                   2042254   10.5%
e                   1838696   9.4%
o                   1758769   9.0%
n                   1705118   8.7%
p                   1548294   7.9%
t                   1538055   7.9%
_                   1484860   7.6%
r                   1310440   6.7%
i                   1190524   6.1%
a                   950855    4.9%
Other values (10)   4130274   21.2%

Most occurring categories, scripts and blocks

Value       Count      Frequency (%)
(unknown)   19498139   100.0%

Most frequent character per category, script and block

Every character is assigned to (unknown), so these three tables are identical to the "Most occurring characters" table.

amt
Real number (ℝ)

Skewed 

Distinct: 60616
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 70.063567
Minimum: 1
Maximum: 28948.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2.44
Q1: 9.64
median: 47.45
Q3: 83.1
95-th percentile: 195.34
Maximum: 28948.9
Range: 28947.9
Interquartile range (IQR): 73.46

Descriptive statistics

Standard deviation: 159.25397
Coefficient of variation (CV): 2.2729927
Kurtosis: 4181.9073
Mean: 70.063567
Median Absolute Deviation (MAD): 37.46
Skewness: 40.812809
Sum: 1.2978533 × 10^8
Variance: 25361.828
Monotonicity: Not monotonic
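
Several of the amt figures are derived from others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the range is maximum minus minimum, and the IQR is Q3 minus Q1. The sketch below recomputes them from the values printed in this section; small differences in the last digits are expected because the inputs are already rounded. The function name is illustrative.

```python
import math

# Figures copied from the amt statistics in this report.
MEAN, STD = 70.063567, 159.25397
MINIMUM, MAXIMUM = 1.0, 28948.9
Q1, Q3 = 9.64, 83.1


def derived_stats(mean, std, minimum, maximum, q1, q3):
    """Recompute the derived statistics from the primary ones."""
    return {
        "cv": std / mean,             # coefficient of variation
        "variance": std ** 2,
        "range": maximum - minimum,
        "iqr": q3 - q1,               # interquartile range
    }


stats = derived_stats(MEAN, STD, MINIMUM, MAXIMUM, Q1, Q3)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```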
Histogram with fixed size bins (bins=50)

Common Values

Value                  Count     Frequency (%)
1.14                   779       < 0.1%
1.1                    745       < 0.1%
1.04                   744       < 0.1%
1.08                   741       < 0.1%
1.2                    737       < 0.1%
1.25                   737       < 0.1%
1.02                   736       < 0.1%
1.01                   735       < 0.1%
1.22                   727       < 0.1%
1.03                   726       < 0.1%
Other values (60606)   1844987   99.6%

Minimum 10 values

Value   Count   Frequency (%)
1       332     < 0.1%
1.01    735     < 0.1%
1.02    736     < 0.1%
1.03    726     < 0.1%
1.04    744     < 0.1%
1.05    721     < 0.1%
1.06    671     < 0.1%
1.07    723     < 0.1%
1.08    741     < 0.1%
1.09    720     < 0.1%

Maximum 10 values

Value      Count   Frequency (%)
28948.9    1       < 0.1%
27390.12   1       < 0.1%
27119.77   1       < 0.1%
26544.12   1       < 0.1%
25086.94   1       < 0.1%
22768.11   1       < 0.1%
21437.71   1       < 0.1%
19364.91   1       < 0.1%
17897.24   1       < 0.1%
16837.08   1       < 0.1%

first
Text

Distinct: 355
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 11
Median length: 9
Mean length: 6.0802977
Min length: 3

Characters and Unicode

Total characters: 11263107
Distinct characters: 49
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Jennifer
2nd row: Stephanie
3rd row: Edward
4th row: Jeremy
5th row: Tyler

Words

Value                Count     Frequency (%)
christopher          38112     2.1%
robert               30743     1.7%
jessica              29236     1.6%
david                28564     1.5%
michael              28539     1.5%
james                28496     1.5%
jennifer             24181     1.3%
john                 23445     1.3%
mary                 23424     1.3%
william              23396     1.3%
Other values (345)   1574258   85.0%

Most occurring characters

Value               Count     Frequency (%)
a                   1438618   12.8%
e                   1230164   10.9%
i                   883628    7.8%
n                   877668    7.8%
r                   867952    7.7%
l                   554750    4.9%
h                   493347    4.4%
s                   463151    4.1%
t                   444904    4.0%
o                   384330    3.4%
Other values (39)   3624595   32.2%

Most occurring categories, scripts and blocks

Value       Count      Frequency (%)
(unknown)   11263107   100.0%

Most frequent character per category, script and block

Every character is assigned to (unknown), so these three tables are identical to the "Most occurring characters" table.

last
Text

Distinct: 486
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 11
Median length: 10
Mean length: 6.1123751
Min length: 2

Characters and Unicode

Total characters: 11322527
Distinct characters: 48
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Banks
2nd row: Gill
3rd row: Sanchez
4th row: White
5th row: Garcia

Words

Value                Count     Frequency (%)
smith                40940     2.2%
williams             33661     1.8%
davis                31434     1.7%
johnson              28590     1.5%
rodriguez            24879     1.3%
martinez             21246     1.1%
jones                19825     1.1%
lewis                18293     1.0%
miller               16821     0.9%
gonzalez             16809     0.9%
Other values (476)   1599896   86.4%

Most occurring characters

Value               Count     Frequency (%)
e                   1122673   9.9%
r                   941641    8.3%
a                   926704    8.2%
n                   869662    7.7%
o                   832319    7.4%
l                   698286    6.2%
s                   696904    6.2%
i                   622878    5.5%
t                   412730    3.6%
h                   327959    2.9%
Other values (38)   3870771   34.2%

Most occurring categories, scripts and blocks

Value       Count      Frequency (%)
(unknown)   11322527   100.0%

Most frequent character per category, script and block

Every character is assigned to (unknown), so these three tables are identical to the "Most occurring characters" table.

gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

F: 1014749
M: 837645

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1852394
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: F
3rd row: M
4th row: M
5th row: M

Common Values

Value   Count     Frequency (%)
F       1014749   54.8%
M       837645    45.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value   Count     Frequency (%)
f       1014749   54.8%
m       837645    45.2%

Most occurring characters

Value   Count     Frequency (%)
F       1014749   54.8%
M       837645    45.2%

Most occurring categories, scripts and blocks

Value       Count     Frequency (%)
(unknown)   1852394   100.0%

Most frequent character per category, script and block

Every character is assigned to (unknown), so these three tables are identical to the "Most occurring characters" table.

street
Text

Distinct: 999
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 35
Median length: 29
Mean length: 22.231289
Min length: 12

Characters and Unicode

Total characters: 41181107
Distinct characters: 62
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 561 Perry Cove
2nd row: 43039 Riley Greens Suite 393
3rd row: 594 White Dale Suite 530
4th row: 9443 Cynthia Court Apt. 038
5th row: 408 Bradley Rest

Words

Value                 Count     Frequency (%)
apt                   468297    6.4%
suite                 437016    5.9%
island                32903     0.4%
michael               27058     0.4%
islands               25611     0.3%
station               25602     0.3%
common                25585     0.3%
david                 24853     0.3%
brooks                24143     0.3%
fields                23400     0.3%
Other values (1959)   6253340   84.9%

Most occurring characters

Value               Count      Frequency (%)
(space)             5515414    13.4%
e                   2561201    6.2%
a                   2077034    5.0%
i                   1851621    4.5%
t                   1782137    4.3%
r                   1576757    3.8%
n                   1523518    3.7%
s                   1476954    3.6%
l                   1270600    3.1%
o                   1251043    3.0%
Other values (52)   20294828   49.3%

Most occurring categories, scripts and blocks

Value       Count      Frequency (%)
(unknown)   41181107   100.0%

Most frequent character per category, script and block

Every character is assigned to (unknown), so these three tables are identical to the "Most occurring characters" table.

city
Text

Distinct: 906
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 25
Median length: 21
Mean length: 8.6526209
Min length: 3

Characters and Unicode

Total characters: 16028063
Distinct characters: 52
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Moravian Falls
2nd row: Orient
3rd row: Malad City
4th row: Boulder
5th row: Doe Hill

Words

Value                Count     Frequency (%)
city                 30780     1.3%
west                 27847     1.2%
saint                20483     0.9%
north                20472     0.9%
falls                18286     0.8%
new                  16857     0.7%
mount                16098     0.7%
lake                 16089     0.7%
san                  14638     0.6%
springs              12414     0.5%
Other values (929)   2118136   91.6%

Most occurring characters

Value               Count     Frequency (%)
e                   1555978   9.7%
a                   1334959   8.3%
n                   1173952   7.3%
o                   1168590   7.3%
l                   1115539   7.0%
r                   1070587   6.7%
i                   1007053   6.3%
t                   855511    5.3%
s                   637587    4.0%
(space)             459706    2.9%
Other values (42)   5648601   35.2%

Most occurring categories, scripts and blocks

Value       Count      Frequency (%)
(unknown)   16028063   100.0%

Most frequent character per category, script and block

Every character is assigned to (unknown), so these three tables are identical to the "Most occurring characters" table.

state
Text

Distinct: 51
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 3704788
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NC
2nd row: WA
3rd row: ID
4th row: MT
5th row: VA

Words

Value               Count     Frequency (%)
tx                  135269    7.3%
ny                  119419    6.4%
pa                  114173    6.2%
ca                  80495     4.3%
oh                  66627     3.6%
mi                  65825     3.6%
il                  62212     3.4%
fl                  60775     3.3%
al                  58521     3.2%
mo                  54904     3.0%
Other values (41)   1034174   55.8%

Most occurring characters

Value               Count     Frequency (%)
A                   508580    13.7%
N                   406389    11.0%
M                   314756    8.5%
I                   260547    7.0%
T                   220136    5.9%
L                   211461    5.7%
O                   205755    5.6%
C                   201235    5.4%
Y                   188176    5.1%
X                   135269    3.7%
Other values (14)   1052484   28.4%

Most occurring categories, scripts and blocks

Value       Count     Frequency (%)
(unknown)   3704788   100.0%

Most frequent character per category, script and block

Every character is assigned to (unknown), so these three tables are identical to the "Most occurring characters" table.

zip
Real number (ℝ)

High correlation 

Distinct: 985
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48813.258
Minimum: 1257
Maximum: 99921
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: 1257
5-th percentile: 7208
Q1: 26237
median: 48174
Q3: 72042
95-th percentile: 94569
Maximum: 99921
Range: 98664
Interquartile range (IQR): 45805

Descriptive statistics

Standard deviation: 26881.846
Coefficient of variation (CV): 0.55070788
Kurtosis: -1.0960542
Mean: 48813.258
Median Absolute Deviation (MAD): 23068
Skewness: 0.078949647
Sum: 9.0421387 × 10^10
Variance: 7.2263364 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                Count     Frequency (%)
82514                5116      0.3%
73754                5116      0.3%
48088                5115      0.3%
34112                5108      0.3%
16114                4392      0.2%
61454                4392      0.2%
72476                4386      0.2%
84540                4386      0.2%
89512                4386      0.2%
72042                4385      0.2%
Other values (975)   1805612   97.5%

Minimum 10 values

Value   Count   Frequency (%)
1257    2923    0.2%
1330    1466    0.1%
1535    734     < 0.1%
1545    1468    0.1%
1612    738     < 0.1%
1843    3652    0.2%
1844    2919    0.2%
2180    738     < 0.1%
2630    2924    0.2%
2908    745     < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
99921   14      < 0.1%
99783   2203    0.1%
99747   12      < 0.1%
99746   734     < 0.1%
99323   3651    0.2%
99160   4362    0.2%
99116   15      < 0.1%
99113   1463    0.1%
99033   3646    0.2%
98836   740     < 0.1%
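
Because zip is profiled as a real number, codes that begin with a zero appear without it: the minimum of 1257 presumably stands for the ZIP code 01257. A minimal sketch of restoring the five-digit text form, assuming every value is a standard five-digit US ZIP code; the function name is illustrative.

```python
def format_zip(value: int) -> str:
    """Render a numeric ZIP code as a zero-padded five-character string."""
    if not 0 <= value <= 99999:
        raise ValueError(f"not a five-digit ZIP code: {value}")
    return f"{value:05d}"


# The smallest and largest zip values listed in this report.
print(format_zip(1257), format_zip(99921))
```

Treating the column as text in this way would also keep it out of numeric summaries such as mean and skewness, which carry little meaning for postal codes.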

lat
Real number (ℝ)

High correlation 

Distinct: 983
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.539311
Minimum: 20.0271
Maximum: 66.6933
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: 20.0271
5-th percentile: 29.8826
Q1: 34.6689
median: 39.3543
Q3: 41.9404
95-th percentile: 45.8433
Maximum: 66.6933
Range: 46.6662
Interquartile range (IQR): 7.2715

Descriptive statistics

Standard deviation: 5.0714704
Coefficient of variation (CV): 0.13159214
Kurtosis: 0.79107707
Mean: 38.539311
Median Absolute Deviation (MAD): 3.3597
Skewness: -0.19199899
Sum: 71389988
Variance: 25.719812
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                Count     Frequency (%)
43.0048              5116      0.3%
36.385               5116      0.3%
42.5164              5115      0.3%
26.1184              5108      0.3%
40.6761              4392      0.2%
41.3851              4392      0.2%
36.0244              4386      0.2%
39.5483              4386      0.2%
38.9999              4386      0.2%
27.4703              4385      0.2%
Other values (973)   1805612   97.5%

Minimum 10 values

Value     Count   Frequency (%)
20.0271   2186    0.1%
20.0827   1463    0.1%
24.6557   3655    0.2%
26.1184   5108    0.3%
26.3304   741     < 0.1%
26.3771   732     < 0.1%
26.4215   4362    0.2%
26.4722   3650    0.2%
26.529    2202    0.1%
26.6939   1467    0.1%

Maximum 10 values

Value     Count   Frequency (%)
66.6933   12      < 0.1%
65.6899   734     < 0.1%
64.7556   2203    0.1%
55.4732   14      < 0.1%
48.8878   4362    0.2%
48.8856   2909    0.2%
48.8328   2200    0.1%
48.6669   1469    0.1%
48.6031   4376    0.2%
48.4786   2916    0.2%

long
Real number (ℝ)

High correlation 

Distinct: 983
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -90.227832
Minimum: -165.6723
Maximum: -67.9503
Zeros: 0
Zeros (%): 0.0%
Negative: 1852394
Negative (%): 100.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: -165.6723
5-th percentile: -119.0825
Q1: -96.798
median: -87.4769
Q3: -80.158
95-th percentile: -73.5365
Maximum: -67.9503
Range: 97.722
Interquartile range (IQR): 16.64

Descriptive statistics

Standard deviation: 13.747895
Coefficient of variation (CV): -0.15236867
Kurtosis: 1.8375586
Mean: -90.227832
Median Absolute Deviation (MAD): 8.1527
Skewness: -1.1469188
Sum: -1.671375 × 10^8
Variance: 189.00461
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                Count     Frequency (%)
-108.8964            5116      0.3%
-98.0727             5116      0.3%
-82.9832             5115      0.3%
-81.7361             5108      0.3%
-80.1752             4392      0.2%
-91.0391             4392      0.2%
-82.7243             4391      0.2%
-109.615             4386      0.2%
-119.7957            4386      0.2%
-90.9288             4386      0.2%
Other values (973)   1805606   97.5%

Minimum 10 values

Value       Count   Frequency (%)
-165.6723   2203    0.1%
-156.292    734     < 0.1%
-155.488    1463    0.1%
-155.3697   2186    0.1%
-153.994    12      < 0.1%
-133.1171   14      < 0.1%
-124.4409   1467    0.1%
-124.2174   2195    0.1%
-124.1587   1465    0.1%
-124.1437   2198    0.1%

Maximum 10 values

Value      Count   Frequency (%)
-67.9503   2922    0.2%
-68.5565   1467    0.1%
-69.2675   743     < 0.1%
-69.4828   2931    0.2%
-69.9576   737     < 0.1%
-69.9656   4374    0.2%
-70.1031   9       < 0.1%
-70.239    1455    0.1%
-70.3001   2924    0.2%
-70.3457   2196    0.1%

city_pop
Real number (ℝ)

Distinct: 891
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 88643.675
Minimum: 23
Maximum: 2906700
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: 23
5-th percentile: 139
Q1: 741
median: 2443
Q3: 20328
95-th percentile: 525713
Maximum: 2906700
Range: 2906677
Interquartile range (IQR): 19587

Descriptive statistics

Standard deviation: 301487.62
Coefficient of variation (CV): 3.4011182
Kurtosis: 37.572846
Mean: 88643.675
Median Absolute Deviation (MAD): 2188
Skewness: 5.5908046
Sum: 1.6420301 × 10^11
Variance: 9.0894784 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                Count     Frequency (%)
606                  8049      0.4%
1595797              7312      0.4%
1312922              7297      0.4%
241                  6578      0.4%
1766                 6556      0.4%
2906700              5865      0.3%
302                  5853      0.3%
198                  5850      0.3%
276002               5849      0.3%
1126                 5841      0.3%
Other values (881)   1787344   96.5%

Minimum 10 values

Value   Count   Frequency (%)
23      2915    0.2%
37      1469    0.1%
43      2920    0.2%
46      4386    0.2%
47      734     < 0.1%
49      1472    0.1%
51      1470    0.1%
52      740     < 0.1%
53      3660    0.2%
60      1472    0.1%

Maximum 10 values

Value     Count   Frequency (%)
2906700   5865    0.3%
2504700   2929    0.2%
2383912   737     < 0.1%
1595797   7312    0.4%
1577385   3680    0.2%
1526206   5113    0.3%
1417793   8       < 0.1%
1382480   2913    0.2%
1312922   7297    0.4%
1263321   5141    0.3%

job
Text

Distinct: 497
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 59
Median length: 38
Mean length: 20.232398
Min length: 3

Characters and Unicode

Total characters: 37478372
Distinct characters: 53
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Psychologist, counselling
2nd row: Special educational needs teacher
3rd row: Nature conservation officer
4th row: Patent attorney
5th row: Dance movement psychotherapist

Words

Value                Count     Frequency (%)
engineer             188048    4.6%
officer              158202    3.8%
manager              87837     2.1%
scientist            79740     1.9%
designer             74639     1.8%
surveyor             70288     1.7%
teacher              54865     1.3%
psychologist         46856     1.1%
research             42426     1.0%
editor               40958     1.0%
Other values (457)   3270295   79.5%

Most occurring characters

Value               Count      Frequency (%)
e                   4003951    10.7%
i                   3407729    9.1%
r                   3140909    8.4%
a                   2593110    6.9%
t                   2547852    6.8%
n                   2521475    6.7%
(space)             2261760    6.0%
o                   2133314    5.7%
s                   2064644    5.5%
c                   1890653    5.0%
Other values (43)   10912975   29.1%

Most occurring categories, scripts and blocks

Value       Count      Frequency (%)
(unknown)   37478372   100.0%

Most frequent character per category, script and block

Every character is assigned to (unknown), so these three tables are identical to the "Most occurring characters" table.

dob
Date

Distinct: 984
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB
Minimum: 1924-10-30 00:00:00
Maximum: 2005-01-29 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

trans_num
Text

Unique 

Distinct: 1852394
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 59276608
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1852394
Unique (%): 100.0%
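
The report lists Distinct and Unique separately. The sketch below assumes the usual reading of these two counts: Distinct is the number of different values, and Unique is the number of values that occur exactly once. Under that reading, trans_num has equal Distinct and Unique counts because no value repeats, while a variable with 693 distinct values can still have 0 unique ones. The function name is illustrative.

```python
from collections import Counter
from typing import Hashable, Iterable, Tuple


def distinct_and_unique(values: Iterable[Hashable]) -> Tuple[int, int]:
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# "a" occurs once, "b" twice, "c" three times: 3 distinct values, 1 unique value.
print(distinct_and_unique(["a", "b", "b", "c", "c", "c"]))
```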

Sample

1st row: 0b242abb623afc578575680df30655b9
2nd row: 1f76529f8574734946361c461b024d99
3rd row: a1a22d70485983eac12b5b88dad1cf95
4th row: 6b849c168bdad6f867558c3793159a81
5th row: a41d7549acf90789359a9aa5346dcb46
Value                             Count    Frequency (%)
f12cf52be2175703db789a4644c32f25  1        < 0.1%
1765bb45b3aa3224b4cdcb6e7a96cee3  1        < 0.1%
0b242abb623afc578575680df30655b9  1        < 0.1%
1f76529f8574734946361c461b024d99  1        < 0.1%
a1a22d70485983eac12b5b88dad1cf95  1        < 0.1%
6b849c168bdad6f867558c3793159a81  1        < 0.1%
a41d7549acf90789359a9aa5346dcb46  1        < 0.1%
189a841a0a8ba03058526bcfe566aab5  1        < 0.1%
83ec1cc84142af6e2acf10c44949e720  1        < 0.1%
6d294ed2cc447d2c71c7171a3d54967c  1        < 0.1%
Other values (1852384)            1852384  > 99.9%
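
The length and character statistics for trans_num (every length statistic equal to 32, 16 distinct characters) suggest a fixed-width lowercase hexadecimal identifier. The sketch below checks that reading against the sample values listed in this report. The hexadecimal interpretation is an assumption inferred from the statistics, not something the report states, and the helper name `looks_like_hex_id` is hypothetical.

```python
import re

# Assumption: trans_num values are exactly 32 lowercase hexadecimal characters.
HEX_ID = re.compile(r"[0-9a-f]{32}")


def looks_like_hex_id(value: str) -> bool:
    """Return True if value is exactly 32 lowercase hex characters."""
    return HEX_ID.fullmatch(value) is not None


# Sample values copied from the report's Sample listing for trans_num.
sample = [
    "0b242abb623afc578575680df30655b9",
    "1f76529f8574734946361c461b024d99",
    "a1a22d70485983eac12b5b88dad1cf95",
    "6b849c168bdad6f867558c3793159a81",
    "a41d7549acf90789359a9aa5346dcb46",
]

all_valid = all(looks_like_hex_id(v) for v in sample)

# With a fixed length, total characters should equal rows x length.
n_rows = 1852394
total_chars = n_rows * 32

print(all_valid, total_chars)
```

Under that assumption the product of row count and length agrees with the reported total of 59276608 characters.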

Most occurring characters

Value             Count     Frequency (%)
9                 3708557   6.3%
4                 3707696   6.3%
7                 3707599   6.3%
2                 3707045   6.3%
3                 3706132   6.3%
1                 3705118   6.3%
d                 3704966   6.3%
a                 3704452   6.2%
8                 3704258   6.2%
c                 3703707   6.2%
Other values (6)  22217078  37.5%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  59276608  100.0%


Most occurring scripts

Value      Count     Frequency (%)
(unknown)  59276608  100.0%


Most occurring blocks

Value      Count     Frequency (%)
(unknown)  59276608  100.0%


unix_time
Real number (ℝ)

Distinct: 1819583
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3586742 × 10^9
Minimum: 1.325376 × 10^9
Maximum: 1.3885344 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: 1.325376 × 10^9
5-th percentile: 1.3300982 × 10^9
Q1: 1.3430168 × 10^9
Median: 1.3570893 × 10^9
Q3: 1.3745815 × 10^9
95-th percentile: 1.3867821 × 10^9
Maximum: 1.3885344 × 10^9
Range: 63158356
Interquartile range (IQR): 31564662

Descriptive statistics

Standard deviation: 18195081
Coefficient of variation (CV): 0.013391791
Kurtosis: -1.1995793
Mean: 1.3586742 × 10^9
Median Absolute Deviation (MAD): 15789076
Skewness: -0.019735681
Sum: 2.5168 × 10^15
Variance: 3.3106099 × 10^14
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)

Common values

Value                   Count    Frequency (%)
1335110521              4        < 0.1%
1370050667              4        < 0.1%
1370177227              4        < 0.1%
1381001869              4        < 0.1%
1386957227              4        < 0.1%
1387312599              4        < 0.1%
1387468942              4        < 0.1%
1355636572              3        < 0.1%
1354595115              3        < 0.1%
1336836798              3        < 0.1%
Other values (1819573)  1852357  > 99.9%

Minimum 10 values

Value       Count  Frequency (%)
1325376018  1      < 0.1%
1325376044  1      < 0.1%
1325376051  1      < 0.1%
1325376076  1      < 0.1%
1325376186  1      < 0.1%
1325376248  1      < 0.1%
1325376282  1      < 0.1%
1325376308  1      < 0.1%
1325376318  1      < 0.1%
1325376361  1      < 0.1%

Maximum 10 values

Value       Count  Frequency (%)
1388534374  1      < 0.1%
1388534364  1      < 0.1%
1388534355  1      < 0.1%
1388534349  1      < 0.1%
1388534347  1      < 0.1%
1388534314  1      < 0.1%
1388534284  1      < 0.1%
1388534276  1      < 0.1%
1388534270  1      < 0.1%
1388534238  1      < 0.1%
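
To read the unix_time extremes as calendar dates, the sketch below converts the smallest and largest reported values, assuming they are seconds since the Unix epoch in UTC (the report does not state the unit or time zone). The helper name `to_utc` is hypothetical.

```python
from datetime import datetime, timezone


def to_utc(ts: int) -> datetime:
    """Interpret ts as seconds since 1970-01-01 00:00:00 UTC (an assumption)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


first = to_utc(1325376018)  # smallest unix_time value in the report
last = to_utc(1388534374)   # largest unix_time value in the report

# Length of the covered period in days, from the two extremes.
span_days = (last - first).total_seconds() / 86400

print(first.isoformat())
print(last.isoformat())
print(round(span_days, 1))
```

Under this reading the values fall in 2012 and 2013, while the Sample section lists trans_date_trans_time values from 2019 and 2020 for the same rows. The two columns may therefore not share a reference point; the sketch only performs the conversion and does not explain the difference.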

merch_lat
Real number (ℝ)

High correlation 

Distinct: 1754157
Distinct (%): 94.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.538976
Minimum: 19.027422
Maximum: 67.510267
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: 19.027422
5-th percentile: 29.753795
Q1: 34.740122
Median: 39.3689
Q3: 41.956263
95-th percentile: 46.002013
Maximum: 67.510267
Range: 48.482845
Interquartile range (IQR): 7.2161407

Descriptive statistics

Standard deviation: 5.1056039
Coefficient of variation (CV): 0.13247897
Kurtosis: 0.77423362
Mean: 38.538976
Median Absolute Deviation (MAD): 3.38992
Skewness: -0.1880969
Sum: 71389368
Variance: 26.067191
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                   Count    Frequency (%)
33.849192               4        < 0.1%
41.301611               4        < 0.1%
40.456305               4        < 0.1%
40.16396                4        < 0.1%
40.550199               4        < 0.1%
41.113971               4        < 0.1%
38.974173               4        < 0.1%
40.87635                4        < 0.1%
40.148393               4        < 0.1%
41.731663               4        < 0.1%
Other values (1754147)  1852354  > 99.9%

Minimum 10 values

Value      Count  Frequency (%)
19.027422  1      < 0.1%
19.027785  1      < 0.1%
19.027804  1      < 0.1%
19.027849  1      < 0.1%
19.029798  1      < 0.1%
19.031242  1      < 0.1%
19.032277  1      < 0.1%
19.032689  1      < 0.1%
19.033288  1      < 0.1%
19.034282  1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
67.510267  1      < 0.1%
67.441518  1      < 0.1%
67.397018  1      < 0.1%
67.188111  1      < 0.1%
67.064277  1      < 0.1%
66.835174  1      < 0.1%
66.682905  1      < 0.1%
66.679297  1      < 0.1%
66.674714  1      < 0.1%
66.67355   1      < 0.1%
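
Several merch_lat statistics are derived from others in the same listing: the range from minimum and maximum, the IQR from Q1 and Q3, the coefficient of variation from standard deviation and mean, and the variance from the standard deviation. A minimal sketch of those relationships, using the rounded figures printed in this report (so agreement is expected only to about six decimal places):

```python
import math

# Figures copied from the merch_lat statistics in this report.
minimum, maximum = 19.027422, 67.510267
q1, q3 = 34.740122, 41.956263
mean, std = 38.538976, 5.1056039

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

# Compare against the values printed in the report.
checks = {
    "range": math.isclose(value_range, 48.482845, abs_tol=1e-5),
    "iqr": math.isclose(iqr, 7.2161407, abs_tol=1e-5),
    "cv": math.isclose(cv, 0.13247897, abs_tol=1e-6),
    "variance": math.isclose(variance, 26.067191, abs_tol=1e-5),
}
print(checks)
```

The same relationships apply to the other numeric variables, for example merch_long, where the negative mean makes the coefficient of variation negative.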

merch_long
Real number (ℝ)

High correlation 

Distinct: 1809753
Distinct (%): 97.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -90.22794
Minimum: -166.67157
Maximum: -66.950902
Zeros: 0
Zeros (%): 0.0%
Negative: 1852394
Negative (%): 100.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: -166.67157
5-th percentile: -119.30928
Q1: -96.89944
Median: -87.440694
Q3: -80.245108
95-th percentile: -73.365169
Maximum: -66.950902
Range: 99.720673
Interquartile range (IQR): 16.654332

Descriptive statistics

Standard deviation: 13.759692
Coefficient of variation (CV): -0.15249924
Kurtosis: 1.8312584
Mean: -90.22794
Median Absolute Deviation (MAD): 8.2235005
Skewness: -1.143933
Sum: -1.6713769 × 10^8
Variance: 189.32913
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                   Count    Frequency (%)
-92.521318              4        < 0.1%
-87.830842              4        < 0.1%
-82.223196              4        < 0.1%
-81.219189              4        < 0.1%
-74.433003              4        < 0.1%
-95.822621              4        < 0.1%
-90.85685               4        < 0.1%
-80.893888              4        < 0.1%
-81.036745              4        < 0.1%
-87.621011              4        < 0.1%
Other values (1809743)  1852354  > 99.9%

Minimum 10 values

Value        Count  Frequency (%)
-166.671575  1      < 0.1%
-166.671242  1      < 0.1%
-166.670685  1      < 0.1%
-166.670132  1      < 0.1%
-166.670006  1      < 0.1%
-166.66991   1      < 0.1%
-166.669812  1      < 0.1%
-166.669638  1      < 0.1%
-166.666179  1      < 0.1%
-166.664828  1      < 0.1%

Maximum 10 values

Value       Count  Frequency (%)
-66.950902  1      < 0.1%
-66.952026  1      < 0.1%
-66.952352  1      < 0.1%
-66.955602  1      < 0.1%
-66.955996  1      < 0.1%
-66.95654   1      < 0.1%
-66.957364  1      < 0.1%
-66.958659  1      < 0.1%
-66.958751  1      < 0.1%
-66.959178  1      < 0.1%

is_fraud
Categorical

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB
0: 1842743
1: 9651

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1852394
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count    Frequency (%)
0      1842743  99.5%
1      9651     0.5%
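
The class counts for is_fraud can be turned into a fraud rate and a single imbalance score. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum, log2 of the number of classes. That definition is an assumption rather than something this report states, although the result, about 0.953, is consistent with the 95.3% imbalance figure in the report's Alerts. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max for a list of class counts.

    0.0 means perfectly balanced classes, values near 1.0 mean one class
    dominates. Assumed definition: normalized Shannon entropy, base 2.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n_classes)


# Class counts copied from the Common Values table for is_fraud.
counts = [1842743, 9651]

fraud_rate = counts[1] / sum(counts)
score = imbalance_score(counts)

print(f"fraud rate: {fraud_rate:.4%}")
print(f"imbalance score: {score:.3f}")
```

With a positive class this rare, plain accuracy is a weak yardstick: a model that always predicts 0 would already be right for about 99.5% of rows.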

Length

Histogram of lengths of the category


Most occurring characters

Value  Count    Frequency (%)
0      1842743  99.5%
1      9651     0.5%
0.5%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  1852394  100.0%

0.5%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  1852394  100.0%

0.5%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  1852394  100.0%


Interactions

[81 interaction plots (Matplotlib SVG images) are not reproduced in this text version.]

Correlations

            amt        category   cc_num     city_pop   gender     is_fraud   lat        long       merch_lat  merch_long unix_time  zip
amt         1.000      0.019      -0.001     -0.024     0.001      0.000      0.013      -0.000     0.013      -0.000     -0.001     0.001
category    0.019      1.000      0.008      0.014      0.054      0.067      0.010      0.009      0.011      0.009      0.001      0.011
cc_num      -0.001     0.008      1.000      0.049      0.052      0.003      -0.003     -0.013     -0.003     -0.013     0.001      0.013
city_pop    -0.024     0.014      0.049      1.000      0.090      0.002      -0.264     0.087      -0.263     0.086      -0.003     -0.040
gender      0.001      0.054      0.052      0.090      1.000      0.006      0.101      0.091      0.103      0.083      0.000      0.116
is_fraud    0.000      0.067      0.003      0.002      0.006      1.000      0.038      0.038      0.038      0.038      0.022      0.004
lat         0.013      0.010      -0.003     -0.264     0.101      0.038      1.000      0.105      0.991      0.104      0.001      -0.162
long        -0.000     0.009      -0.013     0.087      0.091      0.038      0.105      1.000      0.105      0.998      -0.001     -0.959
merch_lat   0.013      0.011      -0.003     -0.263     0.103      0.038      0.991      0.105      1.000      0.104      0.001      -0.162
merch_long  -0.000     0.009      -0.013     0.086      0.083      0.038      0.104      0.998      0.104      1.000      -0.001     -0.957
unix_time   -0.001     0.001      0.001      -0.003     0.000      0.022      0.001      -0.001     0.001      -0.001     1.000      0.001
zip         0.001      0.011      0.013      -0.040     0.116      0.004      -0.162     -0.959     -0.162     -0.957     0.001      1.000
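
The strongly related pairs in this matrix can be picked out by thresholding the absolute coefficient. The sketch below does that for the location-related variables, with values copied from the matrix. The cutoff of 0.5 is an assumption chosen for illustration, not a setting read from this report, and the function name `high_pairs` is hypothetical.

```python
# Coefficients copied from the correlation matrix (upper triangle only).
corr = {
    ("lat", "long"): 0.105,
    ("lat", "merch_lat"): 0.991,
    ("lat", "merch_long"): 0.104,
    ("lat", "zip"): -0.162,
    ("long", "merch_lat"): 0.105,
    ("long", "merch_long"): 0.998,
    ("long", "zip"): -0.959,
    ("merch_lat", "merch_long"): 0.104,
    ("merch_lat", "zip"): -0.162,
    ("merch_long", "zip"): -0.957,
}

THRESHOLD = 0.5  # assumed cutoff for "high" correlation


def high_pairs(coefficients, threshold):
    """Return pairs with |r| >= threshold, strongest first."""
    selected = [pair for pair, r in coefficients.items() if abs(r) >= threshold]
    return sorted(selected, key=lambda pair: -abs(coefficients[pair]))


high = high_pairs(corr, THRESHOLD)
for a, b in high:
    print(f"{a} ~ {b}: {corr[(a, b)]:+.3f}")
```

The selected pairs are lat with merch_lat, long with merch_long, and zip with both longitude columns. When several columns carry nearly the same signal, keeping one per group is a common simplification for models that are sensitive to collinearity.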

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | trans_date_trans_time | cc_num | merchant | category | amt | first | last | gender | street | city | state | zip | lat | long | city_pop | job | dob | trans_num | unix_time | merch_lat | merch_long | is_fraud
0 | 2019-01-01 00:00:18 | 2703186189652095 | fraud_Rippin, Kub and Mann | misc_net | 4.970 | Jennifer | Banks | F | 561 Perry Cove | Moravian Falls | NC | 28654 | 36.079 | -81.178 | 3495 | Psychologist, counselling | 1988-03-09 | 0b242abb623afc578575680df30655b9 | 1325376018 | 36.011 | -82.048 | 0
1 | 2019-01-01 00:00:44 | 630423337322 | fraud_Heller, Gutmann and Zieme | grocery_pos | 107.230 | Stephanie | Gill | F | 43039 Riley Greens Suite 393 | Orient | WA | 99160 | 48.888 | -118.210 | 149 | Special educational needs teacher | 1978-06-21 | 1f76529f8574734946361c461b024d99 | 1325376044 | 49.159 | -118.186 | 0
2 | 2019-01-01 00:00:51 | 38859492057661 | fraud_Lind-Buckridge | entertainment | 220.110 | Edward | Sanchez | M | 594 White Dale Suite 530 | Malad City | ID | 83252 | 42.181 | -112.262 | 4154 | Nature conservation officer | 1962-01-19 | a1a22d70485983eac12b5b88dad1cf95 | 1325376051 | 43.151 | -112.154 | 0
3 | 2019-01-01 00:01:16 | 3534093764340240 | fraud_Kutch, Hermiston and Farrell | gas_transport | 45.000 | Jeremy | White | M | 9443 Cynthia Court Apt. 038 | Boulder | MT | 59632 | 46.231 | -112.114 | 1939 | Patent attorney | 1967-01-12 | 6b849c168bdad6f867558c3793159a81 | 1325376076 | 47.034 | -112.561 | 0
4 | 2019-01-01 00:03:06 | 375534208663984 | fraud_Keeling-Crist | misc_pos | 41.960 | Tyler | Garcia | M | 408 Bradley Rest | Doe Hill | VA | 24433 | 38.421 | -79.463 | 99 | Dance movement psychotherapist | 1986-03-28 | a41d7549acf90789359a9aa5346dcb46 | 1325376186 | 38.675 | -78.632 | 0
5 | 2019-01-01 00:04:08 | 4767265376804500 | fraud_Stroman, Hudson and Erdman | gas_transport | 94.630 | Jennifer | Conner | F | 4655 David Island | Dublin | PA | 18917 | 40.375 | -75.204 | 2158 | Transport planner | 1961-06-19 | 189a841a0a8ba03058526bcfe566aab5 | 1325376248 | 40.653 | -76.153 | 0
6 | 2019-01-01 00:04:42 | 30074693890476 | fraud_Rowe-Vandervort | grocery_net | 44.540 | Kelsey | Richards | F | 889 Sarah Station Suite 624 | Holcomb | KS | 67851 | 37.993 | -100.989 | 2691 | Arboriculturist | 1993-08-16 | 83ec1cc84142af6e2acf10c44949e720 | 1325376282 | 37.163 | -100.153 | 0
7 | 2019-01-01 00:05:08 | 6011360759745864 | fraud_Corwin-Collins | gas_transport | 71.650 | Steven | Williams | M | 231 Flores Pass Suite 720 | Edinburg | VA | 22824 | 38.843 | -78.600 | 6018 | Designer, multimedia | 1947-08-21 | 6d294ed2cc447d2c71c7171a3d54967c | 1325376308 | 38.948 | -78.540 | 0
8 | 2019-01-01 00:05:18 | 4922710831011201 | fraud_Herzog Ltd | misc_pos | 4.270 | Heather | Chase | F | 6888 Hicks Stream Suite 954 | Manor | PA | 15665 | 40.336 | -79.661 | 1472 | Public affairs consultant | 1941-03-07 | fc28024ce480f8ef21a32d64c93a29f5 | 1325376318 | 40.352 | -79.958 | 0
9 | 2019-01-01 00:06:01 | 2720830304681674 | fraud_Schoen, Kuphal and Nitzsche | grocery_pos | 198.390 | Melissa | Aguilar | F | 21326 Taylor Squares Suite 708 | Clarksville | TN | 37040 | 36.522 | -87.349 | 151785 | Pathologist | 1974-03-28 | 3b9014ea8fb80bd65de0b1463b00b00e | 1325376361 | 37.179 | -87.485 | 0

(index) | trans_date_trans_time | cc_num | merchant | category | amt | first | last | gender | street | city | state | zip | lat | long | city_pop | job | dob | trans_num | unix_time | merch_lat | merch_long | is_fraud
1852384 | 2020-12-31 23:57:18 | 30344654314976 | fraud_Larkin, Stracke and Greenfelder | entertainment | 46.710 | Christine | Johnson | F | 8011 Chapman Tunnel Apt. 568 | Blairsden-Graeagle | CA | 96103 | 39.813 | -120.641 | 1725 | Chartered legal executive (England and Wales) | 1967-05-27 | a7105564935ea3977dc61ff9ced3bf5e | 1388534238 | 38.964 | -120.457 | 0
1852385 | 2020-12-31 23:57:50 | 3524574586339330 | fraud_Heathcote, Yost and Kertzmann | shopping_net | 29.560 | Ashley | Cabrera | F | 94225 Smith Springs Apt. 617 | Vero Beach | FL | 32960 | 27.633 | -80.403 | 105638 | Librarian, public | 1986-05-07 | 9fc9f6f9be3182d519a61a119cf97199 | 1388534270 | 27.594 | -80.855 | 0
1852386 | 2020-12-31 23:57:56 | 341546199006537 | fraud_Schmidt-Larkin | home | 12.680 | Mark | Brown | M | 8580 Moore Cove | Wales | AK | 99783 | 64.756 | -165.672 | 145 | Administrator, education | 1939-11-09 | a8310343c189e4a5b6316050d2d6b014 | 1388534276 | 65.624 | -165.186 | 0
1852387 | 2020-12-31 23:58:04 | 501802953619 | fraud_Pouros, Walker and Spencer | kids_pets | 13.020 | Robert | Flores | M | 3277 Fields Meadows Apt. 790 | Greenview | CA | 96037 | 41.540 | -122.937 | 308 | Call centre manager | 1958-09-20 | bd7071fd5c9510a5594ee196368ac80e | 1388534284 | 41.973 | -123.553 | 0
1852388 | 2020-12-31 23:58:34 | 3523843138706408 | fraud_Prosacco, Kreiger and Kovacek | home | 17.000 | Grace | Williams | F | 28812 Charles Mill Apt. 628 | Plantersville | AL | 36758 | 32.618 | -86.948 | 1412 | Drilling engineer | 1970-11-20 | 6d04313bfe4b661b8ca2b6a499a320fe | 1388534314 | 32.164 | -87.540 | 0
1852389 | 2020-12-31 23:59:07 | 30560609640617 | fraud_Reilly and Sons | health_fitness | 43.770 | Michael | Olson | M | 558 Michael Estates | Luray | MO | 63453 | 40.493 | -91.891 | 519 | Town planner | 1966-02-13 | 9b1f753c79894c9f4b71f04581835ada | 1388534347 | 39.947 | -91.333 | 0
1852390 | 2020-12-31 23:59:09 | 3556613125071656 | fraud_Hoppe-Parisian | kids_pets | 111.840 | Jose | Vasquez | M | 572 Davis Mountains | Lake Jackson | TX | 77566 | 29.039 | -95.440 | 28739 | Futures trader | 1999-12-27 | 2090647dac2c89a1d86c514c427f5b91 | 1388534349 | 29.661 | -96.187 | 0
1852391 | 2020-12-31 23:59:15 | 6011724471098086 | fraud_Rau-Robel | kids_pets | 86.880 | Ann | Lawson | F | 144 Evans Islands Apt. 683 | Burbank | WA | 99323 | 46.197 | -118.902 | 3684 | Musician | 1981-11-29 | 6c5b7c8add471975aa0fec023b2e8408 | 1388534355 | 46.658 | -119.715 | 0
1852392 | 2020-12-31 23:59:24 | 4079773899158 | fraud_Breitenberg LLC | travel | 7.990 | Eric | Preston | M | 7020 Doyle Stream Apt. 951 | Mesa | ID | 83643 | 44.626 | -116.449 | 129 | Cartographer | 1965-12-15 | 14392d723bb7737606b2700ac791b7aa | 1388534364 | 44.471 | -117.081 | 0
1852393 | 2020-12-31 23:59:34 | 4170689372027579 | fraud_Dare-Marvin | entertainment | 38.130 | Samuel | Frey | M | 830 Myers Plaza Apt. 384 | Edmond | OK | 73034 | 35.666 | -97.480 | 116001 | Media buyer | 1993-05-10 | 1765bb45b3aa3224b4cdcb6e7a96cee3 | 1388534374 | 36.210 | -97.036 | 0